Market Basket Analysis

Executive Summary

Introduction

Danielle has asked the team to perform a market basket analysis to help Blackwell’s board of directors better understand the clientele that Electronidex is currently serving and if Electronidex would be an optimal acquisition.

A dataset of transactions has been provided. The dataset contains 9835 transactions and 125 different products over a 30-day period, or about 327 transactions a day. This tells us the retailer is neither large, nor small.

Results, conclusions and recommendations

After the analysis, we conclude Electronidex’s sales are to be categorized in two forms: retail (B2C) and corporate (B2B). Had we had this information previous to our analysis, it would have saved time in the exploration phase. Interesting patterns and item relationships found:

Retail by products:

lhs Var.2 rhs
{Samsung Monitor} => {CYBERPOWER Gamer Desktop}
{CYBERPOWER Gamer Desktop} => {Apple Earpods}
{Apple Earpods} => {CYBERPOWER Gamer Desktop}

Retail by category:

lhs Var.2 rhs
4 {Accessories,Keyboard} => {Desktop}
5 {Accessories,Keyboard} => {Desktop}
6 {Computer Cords,Mouse and Keyboard Combo} => {Laptops}

Corporate by products:

lhs Var.2 rhs
7 {HP Laptop} => {iMac}
8 {ViewSonic Monitor} => {iMac}
9 {iMac} => {HP Laptop}

Corporate by categories:

lhs Var.2 rhs
10 {Laptops} => {Desktop}
11 {Laptops,Monitors} => {Desktop}
12 {Desktop} => {Laptops}

Would Blackwell benefit in selling Electronidex’s items? As a conclusion, we do recommend that Blackwell acquires Electronidex for the following reasons:

  • Product portfolio diversification and customer base expansion: Having previously analyzed the current products that Blackwell supplies to its customers, we believe that new product categories will help reach new customers. Additionally, all clients acquired with Eltronidex will become part of Blackwell’s customer base. The combination of diversifaction and expansion will consequentially increase Blackwell’s revenue.

  • Beneficial Product Association: Since Blackwell currently works with close to 50% of brands shared by Electronidex; Blackwell can take advantage of the discovered associations. This will imply the average amount of Blackwell’s items per transaction will be increased.

  • New potential strategies to adopt by Blackwell’s sales team: The aforementioned results opens new marketing possibilities. Extracting the associations with high confidence and low support (sold less often), it is possible to boost sales for those products (through advertisement and other sales actions). The increase in sales number of the said products will bring in increased revenue since we are confident the new transactions will increase sales volume. Blackwell could create some item packs of those items that are more often purchased at the same transaction, in order to also increase sales volume. Additionally, the rules that we found can be also helpful to provide recommendations to users when they purchase through the e-commerce platorm.

Limitations and observations:

Properties of the dataset:

  1. The iMac is the product most bought, in 20% of all transactions. This high number stands out considering the large variety of products, especially being the iMac a pricey product. If this number is representative of all sales throughout the year, then Electronidex is potentially profitable.
  2. The mean of items bought per transaction is almost 5. Logically, we would say most people in the real world would buy 1 or 2 items per transactions most frequently, in an eletronics store.

Technical Report

Preprocessing

transactions as itemMatrix in sparse format with
 9835 rows (elements/itemsets/transactions) and
 125 columns (items) and a density of 0.03506172 

most frequent items:
                    iMac                HP Laptop CYBERPOWER Gamer Desktop 
                    2519                     1909                     1809 
           Apple Earpods        Apple MacBook Air                  (Other) 
                    1715                     1530                    33622 

element (itemset/transaction) length distribution:
sizes
   0    1    2    3    4    5    6    7    8    9   10   11   12   13   14 
   2 2163 1647 1294 1021  856  646  540  439  353  247  171  119   77   72 
  15   16   17   18   19   20   21   22   23   25   26   27   29   30 
  56   41   26   20   10   10   10    5    3    1    1    3    1    1 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   2.000   3.000   4.383   6.000  30.000 

includes extended item information - examples:
                            labels
1 1TB Portable External Hard Drive
2 2TB Portable External Hard Drive
3                   3-Button Mouse

Feature Engineering

# Counting number of items
nitems <- c()
for (i in 1:nrow(trans_df)) {
    nitems <- c(nitems, sum(trans_df[i, ]))
}
trans_df$nitems <- nitems
trans_df$laptops <- trans_df[, which(colnames(trans_df) == "LG Touchscreen Laptop")] + 
    trans_df[, which(colnames(trans_df) == "Acer Aspire")] + 
trans_df[, which(colnames(trans_df) == "HP Laptop")] + trans_df[, which(colnames(trans_df) == 
    "ASUS Chromebook")] + trans_df[, which(colnames(trans_df) == "Apple Macbook Pro")] + 
    trans_df[, which(colnames(trans_df) == "Apple MacBook Air")] + trans_df[, 
    which(colnames(trans_df) == "Dell Laptop")] + trans_df[, which(colnames(trans_df) == 
    "Eluktronics Pro Gaming Laptop")] + trans_df[, which(colnames(trans_df) == 
    "Alienware AW17R4-7345SLV-PUS 17\" Laptop")] + trans_df[, which(colnames(trans_df) == 
    "HP Notebook Touchscreen Laptop PC")]
trans_df$desktop <- trans_df[, which(colnames(trans_df) == "Lenovo Desktop Computer")] + 
    trans_df[, which(colnames(trans_df) == "iMac")] + trans_df[, which(colnames(trans_df) == 
    "HP Desktop")] + trans_df[, which(colnames(trans_df) == "ASUS Desktop")] + 
    trans_df[, which(colnames(trans_df) == "Dell Desktop")] + trans_df[, which(colnames(trans_df) == 
    "Intel Desktop")] + trans_df[, which(colnames(trans_df) == "Acer Desktop")] + 
    trans_df[, which(colnames(trans_df) == "CYBERPOWER Gamer Desktop")] + trans_df[, 
    which(colnames(trans_df) == "Dell 2 Desktop")]
trans_df$tablet <- trans_df[, which(colnames(trans_df) == "iPad")] + trans_df[, 
    which(colnames(trans_df) == "iPad Pro")] + trans_df[, which(colnames(trans_df) == 
    "Fire HD Tablet")] + trans_df[, which(colnames(trans_df) == "Samsung Galaxy Tab")] + 
    trans_df[, which(colnames(trans_df) == "Kindle")]
trans_df$printer <- trans_df$`Epson Printer` + trans_df$`HP Wireless Printer` + 
    trans_df$`Canon Office Printer` + trans_df$`Brother Printer` + trans_df$`DYMO Label Manker`
trans_df$nmain <- trans_df$printer + trans_df$laptops + trans_df$desktop + trans_df$tablet
trans_df$ncomp <- trans_df$nitems - trans_df$nmain
trans_df$value <- 10 * trans_df$nmain + trans_df$ncomp

Creating rules via apriori algorithm

Rules for products in corporate transactions

rules support confidence lift count average
773 {HP Laptop} => {iMac} 0.1273350 0.4267662 1.132930 743 0.8051123
743 {ViewSonic Monitor} => {iMac} 0.0815767 0.4803229 1.275106 476 0.5709258
774 {iMac} => {HP Laptop} 0.1273350 0.3380346 1.132930 743 0.5000222
771 {Lenovo Desktop Computer} => {iMac} 0.0990574 0.4355690 1.156299 578 0.3368490
761 {Dell Desktop} => {iMac} 0.0920308 0.4372964 1.160885 537 0.2143682
772 {iMac} => {Lenovo Desktop Computer} 0.0990574 0.2629663 1.156299 578 -0.2566185
744 {iMac} => {ViewSonic Monitor} 0.0815767 0.2165605 1.275106 476 -0.3359806
762 {iMac} => {Dell Desktop} 0.0920308 0.2443130 1.160885 537 -0.4491756
767 {CYBERPOWER Gamer Desktop} => {iMac} 0.0956298 0.3798502 1.008383 558 -0.4761647
768 {iMac} => {CYBERPOWER Gamer Desktop} 0.0956298 0.2538672 1.008383 558 -0.9093381

Rules for categories in corporate transactions

rules support confidence lift count average
234 {Laptops} => {Desktop} 0.5407027 0.7879620 1.008059 3155 0.6754051
1489 {Laptops,Monitors} => {Desktop} 0.3076264 0.8100181 1.036276 1795 0.5741039
235 {Desktop} => {Laptops} 0.5407027 0.6917343 1.008059 3155 0.3617015
232 {Monitors} => {Desktop} 0.4327335 0.7868495 1.006636 2525 0.2558999
1490 {Desktop,Monitors} => {Laptops} 0.3076264 0.7108911 1.035976 1795 0.2440085
230 {Monitors} => {Laptops} 0.3797772 0.6905578 1.006345 2216 -0.2525664
1491 {Desktop,Laptops} => {Monitors} 0.3076264 0.5689382 1.034514 1795 -0.2526320
228 {Computer Mice} => {Desktop} 0.2904884 0.7821874 1.000672 1695 -0.4018954
233 {Desktop} => {Monitors} 0.4327335 0.5536067 1.006636 2525 -0.5044744
231 {Laptops} => {Monitors} 0.3797772 0.5534466 1.006345 2216 -0.6995507

Rules for products in retailers transactions

rules support confidence lift count average
{CYBERPOWER Gamer Desktop} => {Apple Earpods} 0.01425 0.1676471 1.333177 57 NaN
{Apple Earpods} => {CYBERPOWER Gamer Desktop} 0.01425 0.1133201 1.333177 57 NaN

Rules for categories in retailers transactions

rules support confidence lift count average
25 {Monitors} => {Desktop} 0.05375 0.3272451 1.1452145 215 1.5705548
26 {Desktop} => {Monitors} 0.05375 0.1881015 1.1452145 215 0.9887284
17 {Computer Mice} => {Desktop} 0.03825 0.2875940 1.0064531 153 0.3956863
11 {Keyboard} => {Desktop} 0.03625 0.2859961 1.0008611 145 0.2838193
18 {Desktop} => {Computer Mice} 0.03825 0.1338583 1.0064531 153 -0.2471569
19 {Active Headphones} => {Laptops} 0.03625 0.2449324 0.7825317 145 -0.3314923
12 {Desktop} => {Keyboard} 0.03625 0.1268591 1.0008611 145 -0.3816090
23 {Monitors} => {Laptops} 0.03725 0.2267884 0.7245637 149 -0.4782297
20 {Laptops} => {Active Headphones} 0.03625 0.1158147 0.7825317 145 -0.8713958
24 {Laptops} => {Monitors} 0.03725 0.1190096 0.7245637 149 -0.9289051

Andreu Oros, Sergi Pallice, Joël Ribera

2019-07-26